Text File | 1994-03-01 | 12KB | 318 lines

CXSUB routines
--------------------------------------------------------------------------

As you know, Cx provides a very low level interface to data compression.
Many application designers, however, may be able to use a higher level
interface. The CXSUB routines provide a high level, application
independent interface to Cx data compression.

The CXSUB routines have been carefully designed to allow easy integration
into existing applications. You may be able to use the CXSUB routines in
your applications, but if not, they may be instructive in explaining the
usage of Cx.


The Source Code
--------------------------------------------------------------------------

Source code for the CXSUB routines is found in the files:

   CXSUB.C      - C source code
   CXSUB.H      - C header file
   CXSUB.PAS    - Turbo Pascal source code
   VBCXSUB.BAS  - Visual BASIC source code


Programming Interface
--------------------------------------------------------------------------

CXSUB Error Codes
------------------------------------------------------------------
   CXSUB_ERR_OPENS    - Could not open source.
   CXSUB_ERR_OPEND    - Could not open destination.
   CXSUB_ERR_NOMEM    - Insufficient memory.
   CXSUB_ERR_READ     - Could not read from source.
   CXSUB_ERR_WRITE    - Could not write to destination.
   CXSUB_ERR_CLOSE    - Could not close destination.
   CXSUB_ERR_INVALID  - Source file is invalid or corrupt.

cx_error_message(error)
------------------------------------------------------------------
   PURPOSE:    Return an English error string from a Cx or CXSUB error.
   PARAMETER:  error  - error code (CX_ERR* or CXSUB_ERR*)
   RETURN:     An English error message, or "unknown" if the error
               code is unknown.

cx_compress_file(dst, src, method, bsize, tsize)
------------------------------------------------------------------
   PURPOSE:    Compress any size or type of file to another file.
   PARAMETERS: dst    - destination file name
               src    - source file name
               method - compression method (CX_METHOD*)
               bsize  - compression buffer size (1-CX_MAX_BUFFER)
               tsize  - temporary buffer size
                        (CX_C_MINTEMP-CX_C_MAXTEMP)
   RETURN:     CX_ERR_*    - Cx error.
               CXSUB_ERR_* - CXSUB error.
               0           - No error.
   NOTES:      For maximum compression specify bsize and tsize as
               large as possible. See section 'CXSUB Single File
               Compression' for more information.

cx_decompress_file(dst, src)
------------------------------------------------------------------
   PURPOSE:    Decompress a file compressed with cx_compress_file.
   PARAMETERS: dst    - destination file name
               src    - source file name
   RETURN:     CX_ERR_*    - Cx error.
               CXSUB_ERR_* - CXSUB error.
               0           - No error.
   NOTES:      If dst is not specified (NULL in C, '' in Pascal, "" in
               Visual BASIC), only an integrity check is performed.
               See section 'CXSUB Single File Compression' for more
               information.

cx_compress_ofile(ofile, ifile, method, bsize, tsize)
------------------------------------------------------------------
   PURPOSE:    Compress any size or type of file to another file,
               with both files previously opened.
   PARAMETERS: ofile  - opened output file
               ifile  - opened input file
               method - compression method (CX_METHOD*)
               bsize  - compression buffer size (1-CX_MAX_BUFFER)
               tsize  - temporary buffer size
                        (CX_C_MINTEMP-CX_C_MAXTEMP)
   RETURN:     CX_ERR_*    - Cx error.
               CXSUB_ERR_* - CXSUB error.
               0           - No error.
   NOTES:      For maximum compression specify bsize and tsize as
               large as possible. See section 'CXSUB Single File
               Compression' for more information.

cx_decompress_ofile(ofile, ifile)
------------------------------------------------------------------
   PURPOSE:    Decompress a file compressed with cx_compress_file or
               cx_compress_ofile, with both files previously opened.
   PARAMETERS: ofile  - opened output file
               ifile  - opened input file
   RETURN:     CX_ERR_*    - Cx error.
               CXSUB_ERR_* - CXSUB error.
               0           - No error.
   NOTES:      See section 'CXSUB Single File Compression' for more
               information.


CXSUB Single File Compression (SFC)
--------------------------------------------------------------------------

This section contains general and language specific information about the
following CXSUB functions:

   cx_compress_file     - file name interface
   cx_decompress_file   - file name interface
   cx_compress_ofile    - file handle interface
   cx_decompress_ofile  - file handle interface

Overview
---------------------------------------------------------------------
The CXSUB Single File Compression (SFC) routines provide an easy way to
compress and decompress one file to another. There are two interfaces.

One is based on file names. Using this interface is not much harder than
specifying:

   "Compress file A to file B"
or
   "Decompress file B to file C"

Of course, the decompression routine will only work on files compressed
with the compression routine.

The other interface is based on file handles. A file handle is simply a
way to reference an open file. This interface is provided to allow for
future routines based on the SFC routines. It is possible, for example,
to design an archive file format that uses the handle based interface.

All of the provided SFC source code writes and reads the same file format.

File Format
---------------------------------------------------------------------
The file format is a sequence of variable length 'blocks'. Blocks are
produced by reading data from the file to be compressed. The amount of
data read in each pass is known here as the 'original buffer size' or
BSIZE. If, for example, you are compressing a 1000 byte file, and BSIZE
is 100 bytes, 10 blocks will be produced. BSIZE is a parameter to the
file compression routines (parameter bsize).

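The block arithmetic above can be sketched in C. This is only an
illustration: sfc_block_count and sfc_last_block_size are hypothetical
helper names, not part of CXSUB, and the sketch assumes that every pass
reads a full BSIZE bytes except possibly the last one.

```c
#include <assert.h>

/* Hypothetical helper (not part of CXSUB): number of data blocks produced
   when a file of file_size bytes is read bsize bytes at a time. */
unsigned long sfc_block_count(unsigned long file_size, unsigned bsize)
{
    assert(bsize > 0);
    /* Round up without risking overflow in file_size + bsize. */
    return file_size / bsize + (file_size % bsize != 0);
}

/* Hypothetical helper (not part of CXSUB): number of original bytes held
   by the final data block, or 0 for an empty file. */
unsigned sfc_last_block_size(unsigned long file_size, unsigned bsize)
{
    unsigned long rem;

    assert(bsize > 0);
    if (file_size == 0)
        return 0;
    rem = file_size % bsize;
    return rem != 0 ? (unsigned)rem : bsize;
}
```

With these definitions, sfc_block_count(1000, 100) gives the 10 blocks
mentioned above, and a 25 byte file read 10 bytes at a time gives 3
blocks, the last holding 5 bytes.
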
A block has 4 pieces of information:

   2 bytes      - original buffer size (BSIZE)
   2 bytes      - compressed buffer size (CSIZE)
   2 bytes      - 16 bit CRC (from CX_CRC) (DATACRC)
   CSIZE bytes  - (DATA)

The relation between these 4 pieces of information is:

   If BSIZE is the same as CSIZE, the original buffer could not be
   compressed. DATA contains uncompressed data.

   If BSIZE is not the same as CSIZE, the original buffer was
   successfully compressed. CSIZE will be strictly less than BSIZE.
   DATA contains compressed data.

   DATACRC is a 16 bit CRC computed on DATA. Note that this means
   DATACRC is computed on compressed data whenever the block was
   compressed.

To indicate the end of a compressed file, an abbreviated block is stored.
The abbreviated block is simply:

   2 bytes      - original buffer size (0)

As an example, consider compressing a 25 byte file with BSIZE equal to
10, where:

   bytes  0..9   compress to 7 bytes
   bytes 10..19  can't be compressed
   bytes 20..24  compress to 2 bytes

The file data produced by the SFC compression routines will be:

   ------------------------------
   2 bytes  - 10                   block 1
   2 bytes  - 7
   2 bytes  - DATACRC
   7 bytes  - compressed data
   ------------------------------
   2 bytes  - 10                   block 2
   2 bytes  - 10
   2 bytes  - DATACRC
   10 bytes - uncompressed data
   ------------------------------
   2 bytes  - 5                    block 3
   2 bytes  - 2
   2 bytes  - DATACRC
   2 bytes  - compressed data
   ------------------------------
   2 bytes  - 0                    block 4, abbreviated end of file block

Of course, you would typically use a BSIZE much larger than 10 bytes.
For maximum compression, you would use a BSIZE of CX_MAX_BUFFER.

Motivation / Questions / Expanding or Improving the SFC routines
---------------------------------------------------------------------
The following questions and answers may provide insight into the SFC
functions.

Q: Why are bsize and tsize parameters? For maximum compression, bsize
   should always be CX_MAX_BUFFER and tsize should always be
   CX_C_MAXTEMP.

A: Some applications want or need to minimize memory usage.
   By keeping bsize and tsize as parameters, the application can
   balance memory usage against compressed size.

Q: Why are both BSIZE and CSIZE stored?

A: By storing both, it is possible to handle incompressible data. If
   BSIZE is equal to CSIZE, the stored buffer is known to be
   uncompressed.

Q: Why is a CRC computed on the compressed buffer as opposed to the
   original buffer?

A: Testing has determined that a CRC on compressed buffers is better
   able to detect errors than a CRC on original buffers. In addition,
   as compressed buffers are typically smaller than original buffers,
   a CRC on a compressed buffer is quicker to compute.

Q: Why is the last block abbreviated?

A: Simply to save space. By abbreviating the final block, it is
   possible to save 4 bytes of storage for each compressed file. Note,
   however, that this is a fairly arbitrary decision. As file I/O
   calls consume time, it may be desirable to store a 'complete'
   block. This would eliminate up to 2 file I/O calls per block when
   decompressing.

Q: Why isn't the compression method stored?

A: CX_DECOMPRESS can decompress any buffer compressed with CX_COMPRESS
   without knowing beforehand the specific compression method used.

Q: What if I wanted to store an original file's time stamp and/or name
   in a compressed file?

A: This would be a fairly easy addition. You could add a header to the
   SFC file format. As an example:

      4 bytes        - file's time stamp
      1 byte         - name length (NAMELEN)
      NAMELEN bytes  - file name

   The cx_compress_file and cx_decompress_file functions could be
   modified to write, read and use this header information. The only
   additional routines you would have to call (included in most
   languages) are for reading and writing a file's time stamp.

Q: What if I wanted to extract valid data from a corrupt compressed
   file?

A: This could be accomplished by expanding a block.
   Instead of:

      2 bytes      - original buffer size (BSIZE)
      2 bytes      - compressed buffer size (CSIZE)
      2 bytes      - 16 bit CRC (from CX_CRC) (DATACRC)
      CSIZE bytes  - (DATA)

   You could specify:

      4 bytes      - header like '$CX$'
      4 bytes      - physical file location (POS)
      2 bytes      - original buffer size (BSIZE)
      2 bytes      - compressed buffer size (CSIZE)
      2 bytes      - 16 bit CRC (from CX_CRC) (DATACRC)
      CSIZE bytes  - (DATA)

   With a corrupt file, you could search the file for the block header
   ($CX$). After finding a header, you would have all the information
   you need to extract a valid original buffer. If there are errors
   when decompressing a block, you would know it is invalid. Note that
   smaller BSIZE values will have more potential for recovery, as each
   block will affect less data.
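
The header search described in the last answer could look roughly like
the C sketch below. It is not taken from CXSUB: the names
recovered_block and find_next_block are hypothetical, and the sketch
makes several assumptions that the text above does not state. It
assumes the multi-byte fields are stored low byte first, that POS
holds the offset of the '$CX$' marker itself, and that the damaged
file has already been read into memory. Checking DATACRC with CX_CRC
and decompressing DATA are left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MARKER     "$CX$"
#define MARKER_LEN 4
#define HEADER_LEN 14   /* marker + POS + BSIZE + CSIZE + DATACRC */

/* Hypothetical record describing one candidate block (not part of CXSUB). */
typedef struct {
    unsigned long pos;     /* POS field read from the block            */
    unsigned bsize;        /* original buffer size                     */
    unsigned csize;        /* compressed buffer size                   */
    unsigned datacrc;      /* stored CRC, to be checked by the caller  */
    size_t data_offset;    /* where DATA starts inside buf             */
} recovered_block;

/* Assumption: fields are stored low byte first. */
static unsigned get_u16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long get_u32(const unsigned char *p)
{
    return (unsigned long)get_u16(p) | ((unsigned long)get_u16(p + 2) << 16);
}

/* Scan buf[start..len) for the next plausible expanded block.
   Returns 1 and fills *out when one is found, 0 otherwise. */
int find_next_block(const unsigned char *buf, size_t len, size_t start,
                    recovered_block *out)
{
    size_t i;

    assert(buf != NULL && out != NULL);
    for (i = start; i + HEADER_LEN <= len; i++) {
        if (memcmp(buf + i, MARKER, MARKER_LEN) != 0)
            continue;
        out->pos = get_u32(buf + i + 4);
        out->bsize = get_u16(buf + i + 8);
        out->csize = get_u16(buf + i + 10);
        out->datacrc = get_u16(buf + i + 12);
        out->data_offset = i + HEADER_LEN;
        /* Reject marker bytes that merely occur inside damaged data. */
        if (out->pos != (unsigned long)i)
            continue;
        if (out->bsize == 0 || out->csize == 0 || out->csize > out->bsize)
            continue;
        if (out->csize > len - out->data_offset)
            continue;
        return 1;
    }
    return 0;
}
```

A recovery tool could call find_next_block repeatedly, restarting the
search just past each block it accepts (or one byte past a block whose
CRC or decompression fails), and write each recovered buffer at the
position implied by POS and BSIZE.
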